Convergence of Limited Communication Gradient Methods
Authors
Abstract
Similar resources
Gradient Convergence in Gradient Methods
For the classical gradient method x_{t+1} = x_t − γ_t ∇f(x_t) and several deterministic and stochastic variants, we discuss the issue of convergence of the gradient sequence ∇f(x_t) and the attendant issue of stationarity of limit points of x_t. We assume that ∇f is Lipschitz continuous, and that the stepsize γ_t diminishes to 0 and satisfies standard stochastic approximation conditions. We show that eit...
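A minimal sketch, not drawn from the cited paper, of the iteration above; the stepsize schedule γ_t = c/(t+1) and the quadratic test problem are illustrative assumptions.

```python
import numpy as np

def gradient_method(grad_f, x0, num_iters=1000, c=1.0):
    """Classical gradient method x_{t+1} = x_t - gamma_t * grad f(x_t)
    with the illustrative diminishing stepsize gamma_t = c / (t + 1),
    which satisfies sum gamma_t = inf and sum gamma_t^2 < inf."""
    x = np.asarray(x0, dtype=float)
    for t in range(num_iters):
        gamma_t = c / (t + 1)        # stepsize diminishing to 0
        x = x - gamma_t * grad_f(x)  # gradient step
    return x

# Example on f(x) = 0.5 * ||x||^2, whose gradient is x itself.
print(gradient_method(lambda x: x, x0=np.array([3.0, -2.0])))
```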
Gradient Convergence in Gradient Methods with Errors
We consider the gradient method x_{t+1} = x_t + γ_t(s_t + w_t), where s_t is a descent direction of a function f : ℝ^n → ℝ and w_t is a deterministic or stochastic error. We assume that ∇f is Lipschitz continuous, that the stepsize γ_t diminishes to 0, and that s_t and w_t satisfy standard conditions. We show that either f(x_t) → −∞ or f(x_t) converges to a finite value and ∇f(x_t) → 0 (with probability 1 in t...
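A minimal sketch of the noisy iteration above, assuming s_t = −∇f(x_t) and zero-mean Gaussian w_t; these concrete choices are illustrative, not taken from the paper.

```python
import numpy as np

def gradient_method_with_errors(grad_f, x0, num_iters=2000, c=1.0,
                                noise_std=0.1, seed=0):
    """Iteration x_{t+1} = x_t + gamma_t * (s_t + w_t), with s_t = -grad f(x_t)
    as the descent direction and w_t zero-mean Gaussian noise (an illustrative
    choice of error satisfying the standard conditions)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for t in range(num_iters):
        gamma_t = c / (t + 1)                            # diminishing stepsize
        s_t = -grad_f(x)                                 # descent direction
        w_t = rng.normal(0.0, noise_std, size=x.shape)   # stochastic error
        x = x + gamma_t * (s_t + w_t)
    return x
```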
Convergence Properties of Nonlinear Conjugate Gradient Methods
Recently, important contributions on convergence studies of conjugate gradient methods have been made by Gilbert and Nocedal [6]. They introduce a "sufficient descent condition" to establish global convergence results. Although this condition is not needed in the convergence analyses of Newton and quasi-Newton methods, [6] hints that the sufficient descent condition, which was enforced by their ...
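A minimal sketch of how a sufficient descent condition, d_t^T g_t ≤ −c‖g_t‖^2, can be enforced inside a nonlinear conjugate gradient loop; the PR+ formula, the restart rule, and the fixed stepsize are illustrative choices and not the analysis of [6].

```python
import numpy as np

def nonlinear_cg(grad_f, x0, num_iters=200, c=1e-4, alpha=1e-2):
    """Polak-Ribiere-type nonlinear CG that enforces the sufficient descent
    condition d_t^T g_t <= -c * ||g_t||^2 by restarting with -g_t whenever
    the condition fails; a fixed stepsize stands in for a line search."""
    x = np.asarray(x0, dtype=float)
    g = grad_f(x)
    d = -g
    for _ in range(num_iters):
        if g @ g == 0.0:             # already at a stationary point
            break
        if d @ g > -c * (g @ g):     # sufficient descent condition violated
            d = -g                   # restart with the steepest descent direction
        x = x + alpha * d
        g_new = grad_f(x)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))   # PR+ formula
        d = -g_new + beta * d
        g = g_new
    return x
```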
Quantized Stochastic Gradient Descent: Communication versus Convergence
Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to the excellent scalability properties of this algorithm and to its efficiency in the context of training deep neural networks. A fundamental barrier for parallelizing large-scale SGD is the fact that the cost of communicating the gradient updates between nodes can be very la...
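A minimal sketch of the kind of unbiased gradient quantization that trades communication cost for convergence; the uniform-level scheme and the function name are assumptions for illustration, not the paper's exact quantizer.

```python
import numpy as np

def quantize_gradient(g, num_levels=4, rng=None):
    """Unbiased stochastic quantization in the spirit of quantized SGD: each
    coordinate is rounded to one of `num_levels` uniform levels of the gradient
    norm, so a worker only needs to send the norm, the signs, and small
    integer level indices."""
    rng = np.random.default_rng() if rng is None else rng
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return np.zeros_like(g)
    scaled = np.abs(g) / norm * num_levels   # scale coordinates into [0, num_levels]
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part, so E[quantized] = g.
    levels = lower + (rng.random(g.shape) < scaled - lower)
    return np.sign(g) * levels * (norm / num_levels)

# A worker would send quantize_gradient(local_grad); the server averages the
# dequantized vectors and applies an ordinary SGD step.
```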
Distributed SAGA: Maintaining linear convergence rate with limited communication
In recent years, variance-reducing stochastic methods have shown great practical performance, exhibiting linear convergence rate when other stochastic methods offered a sub-linear rate. However, as datasets grow ever bigger and clusters become widespread, the need for fast distribution methods is pressing. We propose here a distribution scheme for SAGA which maintains a linear convergence rate,...
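For context, a minimal sketch of the underlying single-machine SAGA update; the hypothetical grad_i(i, x) returns the gradient of the i-th term of the finite sum, and the paper's limited-communication distribution scheme is not reproduced here.

```python
import numpy as np

def saga(grad_i, n, x0, step=0.01, num_iters=5000, seed=0):
    """Plain single-machine SAGA: a table stores the last gradient evaluated
    for each data point and acts as a variance-reduction control variate,
    which yields a linear rate for smooth strongly convex finite sums."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    table = np.array([grad_i(i, x) for i in range(n)])   # stored per-point gradients
    table_avg = table.mean(axis=0)
    for _ in range(num_iters):
        i = rng.integers(n)
        g_new = grad_i(i, x)
        # Variance-reduced direction: new gradient minus stored one, plus the average.
        x = x - step * (g_new - table[i] + table_avg)
        table_avg = table_avg + (g_new - table[i]) / n   # maintain the running average
        table[i] = g_new
    return x
```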
Journal
Journal title: IEEE Transactions on Automatic Control
Year: 2018
ISSN: 0018-9286, 1558-2523
DOI: 10.1109/tac.2017.2743678